NON-TOXIC, NON-TOXIGENIC, NON-PATHOGENIC FUSARIUM
EXPRESSION SYSTEM AND PROMOTERS AND TERMINATORS FOR USE 
THEREIN 

1. FIELD OF THE INVENTION 

The present invention relates to host cells useful in the production of 
recombinant proteins. In particular, the invention relates to non-toxic, non-toxigenic, and non- 
pathogenic fungal host cells of Fusarium which can be used in the high-level expression of 
recombinant proteins, especially enzymes. The invention further relates to promoter and 
terminator sequences which may be used in such a system. 

2. BACKGROUND OF THE INVENTION 

The use of recombinant host cells in the expression of heterologous proteins has 
in recent years greatly simplified the production of large quantities of commercially valuable 
proteins, which otherwise are obtainable only by purification from their native sources. 
Currently, there is a varied selection of expression systems from which to choose for the 
production of any given protein, including prokaryotic and eukaryotic hosts. The selection of 
an appropriate expression system will often depend not only on the ability of the host cell to 
produce adequate yields of the protein in an active state, but also to a large extent may be
governed by the intended end use of the protein. 

Although mammalian and yeast cells have been the most commonly used 
eukaryotic hosts, filamentous fungi have now begun to be recognized as very useful as host 
cells for recombinant protein production. Examples of filamentous fungi which are currently
used or proposed for use in such processes are Neurospora crassa, Acremonium
chrysogenum, Tolypocladium geodes, Mucor circinelloides, Trichoderma reesei,
Aspergillus nidulans, Aspergillus niger and Aspergillus oryzae.

Certain species of the genus Fusarium have been used as model systems for the 
studies of plant pathogenicity and gene regulation, such as Fusarium oxysporum (Diolez et al.,
1993, Gene 131:61-67; Langin et al., 1990, Curr. Genet. 17:313-319; Malardier et al., 1989,
Gene 78:147-156; and Kistler and Benny, 1988, Curr. Genet. 13:145-149), Fusarium solani
(Crowhurst et al., 1992, Curr. Genet. 21:463-469), and Fusarium culmorum (Curragh et al.,
1992, Mycol. Res. 97:313-317). These Fusarium sp. would not be suitable commercially for
the production of heterologous proteins because of undesirable characteristics such as
being plant pathogens or producing unsafe levels of mycotoxin. Dickman and
Leslie (1992, Mol. Gen. Genet. 235:458-462) disclose the transformation of Gibberella zeae
with a plasmid containing nit-2 of Neurospora crassa. The strain of Gibberella zeae disclosed
in Dickman and Leslie is a plant pathogen and produces zearalenone, an estrogenic mycotoxin.
Sanchez-Fernandez et al. (1991, Mol. Gen. Genet. 225:231-233) disclose the transformation
of Gibberella fujikuroi carrying a niaD mutation with a plasmid containing the Aspergillus niger
niaD gene.

An ideal expression system is one which is substantially free of protease and 
mycotoxin production, also substantially free of large amounts of other endogenously made
secreted proteins, and which is capable of higher levels of expression than known host cells. 
The present invention now provides new Fusarium expression systems which fulfill these 
requirements. 

3. SUMMARY OF THE INVENTION

The present invention provides a recombinant non-toxic, non-toxigenic, and 
non-pathogenic Fusarium host cell comprising a nucleic acid sequence encoding a heterologous 
protein operably linked to a promoter. As defined herein, "non-toxic" means that the host cell 
does not act as a poison to plants or animals. For example, a Fusarium host cell would be 
considered non-toxic if about 14 days after injecting about 5 mice with a dose of about 20 ml of 
(1:1 diluted) 3 day old Fusarium culture medium/kg body wt./mouse, none of the mice died as
a result of Fusarium treatment. As defined herein, "non-toxigenic" means that the host cells
are essentially free of mycotoxin as determined by standard analytical methods such as HPLC 
analysis. For example, an amount of Fusarium grown on 2 x 9 cm petri dishes containing 
solid nutrient medium may be extracted with organic solvents and 0.5% of the extract may be 
injected into an HPLC for analysis. The absence of known mycotoxins would be inferred by 
the absence of detectable HPLC peaks at positions known for mycotoxin standards. As 
defined herein, "non-pathogenic" means that the host cells do not cause significant disease in 
healthy plants or healthy animals. For example, a Fusarium sp. that is pathogenic to plants can 
show a fungal invasion of the xylem tissue of the plant and result in the disease state 
characterized by typical wilt symptoms. As defined herein, a "heterologous protein" is a 
protein which is not native to the host cell, or a native protein in which modifications have been 
made to alter the native sequence or a native protein whose expression is quantitatively altered 
as a result of a manipulation of a native regulatory sequence required for the expression of the 
native protein, such as a promoter, a ribosome binding site, etc. or other manipulation of the 
host cell by recombinant DNA techniques. The nucleic acid sequence is operably linked to a 
suitable promoter sequence, which is capable of directing transcription of the nucleic acid 
sequence in the chosen host cell.

The invention also relates to a method for production of recombinant proteins, 
the method comprising culturing a host cell of one of the aforementioned species, which host 
cell contains a nucleic acid sequence encoding a heterologous protein, under conditions 
conducive to expression of the protein, and recovering the protein from the culture. In a 
preferred embodiment, the protein is a fungal protein, most preferably a fungal enzyme. Using
the method of the present invention, at least about 0.5 g heterologous protein/l host cell is 
produced. 

The host cell of the present invention secretes unexpectedly only low amounts 
of protease as determined by the casein clearing assay described in Section 6.1, infra; 
specifically only small or no zones of hydrolysis are detected. The host cells and methods of 
the present invention are unexpectedly more efficient in the recombinant production of certain 
fungal enzymes than are other known fungal species, such as Aspergillus niger, Aspergillus
oryzae, or Fusarium oxysporum. 

The invention further relates to a promoter sequence derived from a gene 
encoding a Fusarium oxysporum trypsin-like protease or a fragment thereof having 
substantially the same promoter activity as said sequence. The sequence of the promoter is 
shown in SEQ ID NO:5. 

Additionally, the invention relates to a terminator sequence derived from a gene 
encoding a Fusarium oxysporum trypsin-like protease or a fragment thereof having 
substantially the same terminator activity as said sequence. The sequence of the terminator is 
shown in SEQ ID NO:6. 

4. BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 shows an SDS gel of secreted proteins in Fusarium graminearum (lane 
1); Aspergillus niger (lane 2); and Aspergillus oryzae (lane 3). Lane 4 shows molecular weight 
markers. 

Figure 2 shows the results of a protease assay on the following samples: Aspergillus oryzae
(well 1); Aspergillus niger (well 2); Fusarium graminearum (well 3); empty well controls
(wells 4-6).

Figure 3 shows the construction of plasmid pJRoy6. 

Figure 4 shows SDS-PAGE analysis of the secretion of a trypsin-like protease 
(SP387) in a transformant of F. graminearum 20334. Lane 1: molecular size markers; lane 2:
blank; lane 3: purified trypsin-like protease protein standard; lane 4: blank; lane 5:
F. graminearum strain 20334 untransformed; lane 6: blank; lane 7: F. graminearum strain 20334
transformed with plasmid pJRoy6; lane 8: blank; lane 9: molecular size markers.

Figure 5 shows a restriction map of pJRoy20. 

Figure 6 shows a restriction map of pDM151. 

Figure 7 shows a restriction map of pDM155. 

Figures 8A and 8B show the level of expression of Carezyme® in Fusarium
graminearum when DSM 151-4 is fermented in Fusarium graminearum from 20-160 hrs.
Figure 8A shows the results of an assay for Carezyme®. Figure 8B shows SDS-PAGE
analysis of the production of Carezyme® in said Fusarium graminearum. Lane 1: molecular
size markers; lane 2: 20 hrs.; lane 3: 50 hrs.; lane 4: 70 hrs.; lane 5: 90 hrs.; lane 6: 120 hrs.; lane
7: 140 hrs.; lane 8: 160 hrs.

Figures 9A and 9B show the level of expression of Lipolase® when DSM 155-10
is fermented in Fusarium graminearum from 20-160 hrs. Figure 9A shows the results of an
assay for Lipolase®. Figure 9B shows SDS-PAGE analysis of the production of Lipolase® in
said Fusarium graminearum. Lane 1: molecular size markers; lane 2: 20 hrs.; lane 3: 50 hrs.;
lane 4: 60 hrs.; lane 5: 90 hrs.; lane 6: 120 hrs.; lane 7: 140 hrs.; lane 8: 160 hrs.

Figure 10 shows a restriction map of pCaHj418. 

Figure 11 shows a restriction map of pDM148.

Figure 12 shows a restriction map of pDM149. 

Figure 13 shows a restriction map of pMHan37. 

Figure 14 shows a restriction map of pDM154. 

5. DETAILED DESCRIPTION OF THE INVENTION

Fusarium are characterized by mycelium that is extensive and cotton-like in culture,
often with some tinge of pink, purple or yellow in the mycelium on solid medium.
Conidiophores are variable: slender and simple, or stout, short, branched irregularly or bearing
a whorl of phialides, single or grouped into sporodochia. Conidia are principally of two kinds, 
often held in small moist heads: macroconidia several-celled, slightly curved or bent at the 
pointed ends, typically canoe-shaped and microconidia which are one celled, ovoid or oblong, 
borne singly or in chains. Some conidia are intermediate, 2 or 3 celled, oblong or slightly 
curved. 

In a specific embodiment, the host cells of the present invention are of the 
species Fusarium graminearum, which is characterized by the following features. Conidia:
Microconidia are absent. Macroconidia are distinctly septate, thick walled, straight to
moderately sickle-shaped, unequally curved with the ventral surface almost straight and a
smoothly arched dorsal surface. The basal cell is distinctly foot-shaped. The apical cell is
cone-shaped or constricted as a snout. Conidiophores: unbranched and branched
monophialides. Chlamydospores: are generally very slow to form in culture; when they do
occur, they most often form in the macroconidia but may also form in the mycelium. Colony 
morphology: on PDA, growth is rapid with dense aerial mycelium that may almost fill the tube 
and is frequently yellow to tan with the margins white to carmine red. Red-brown to orange 
sporodochia, if present, are sparse, often appearing only when the cultures are more than 30
days old. The undersurface is usually carmine red. This fungus produces the most cylindrical 
(dorsal and ventral surfaces parallel) macroconidia of any species of the section Discolor. 

In a most specific embodiment, the Fusarium graminearum is Fusarium
graminearum Schwabe IMI 145425, deposited with the American Type Culture Collection and
assigned the number ATCC 20334 (U.S. Patent No. 4,041,189), as well as derivatives and 
mutants which are similarly non-toxic, non-toxigenic, and non-pathogenic, e.g. those taught in 
U.S. Patent No. 4,041,189. 

It will be understood that throughout the specification and claims the use of the
term "Fusarium graminearum" refers not only to organisms encompassed in this species, but
also includes those species which have previously been or currently are designated as other
species in alternate classification schemes, but which possess the same morphological and
cultural characteristics defined above, and may be synonymous to F. graminearum. These
include but are not limited to Fusarium roseum, F. roseum var. graminearum, Gibberella zeae,
Gibberella roseum, or Gibberella roseum f. sp. cerealis.

The skilled artisan will also recognize that the successful transformation of the 
host species described herein is not limited to the use of the vectors, promoters, and selection 
markers specifically exemplified. Generally speaking, those techniques which are useful in 
transformation of F. oxysporum, F. solani and F. culmorum are also useful with the host cells
of the present invention. For example, although the amdS selection marker is preferred, other
useful selection markers include the argB (A. nidulans or A. niger), trpC (A. niger or A.
nidulans), pyrG (A. niger, A. oryzae or A. nidulans), niaD (A. nidulans, A. niger, or F.
oxysporum), and hygB (E. coli) markers. The promoter may be any DNA sequence that
shows strong transcriptional activity in these species, and may be derived from genes encoding
both extracellular and intracellular proteins, such as amylases, glucoamylases, proteases,
lipases, cellulases and glycolytic enzymes. Examples of such promoters include but are not
limited to A. nidulans amdS promoter or promoters from genes for glycolytic enzymes, e.g., 
TPI, ADH, GAPDH, and PGK. The promoter may also be a homologous promoter, i.e., the 
promoter for a gene native to the host strain being used. The promoter sequence may also be 
provided with linkers for the purpose of introducing specific restriction sites facilitating ligation 
of the promoter sequence with the gene of choice or with a selected signal peptide or preregion.

The promoter sequence may be derived from a gene encoding a Fusarium 
oxysporum trypsin-like protease or a fragment thereof having substantially the same promoter 
activity as said sequence. The sequence of the promoter is shown in SEQ ID NO:5. The
invention further encompasses nucleic acid sequences which hybridize to the promoter
sequence shown in SEQ ID NO:5 under the following conditions: presoaking in 5X SSC and
prehybridizing for 1 hr. at about 40°C in a solution of 20% formamide, 5X Denhardt's
solution, 50 mM sodium phosphate, pH 6.8, and 50 µg denatured sonicated calf thymus DNA,
followed by hybridization in the same solution supplemented with 100 µM ATP for 18 hrs. at
about 40°C, followed by a wash in 0.4X SSC at a temperature of about 45°C, or which have
at least about 90% homology and preferably about 95% homology to SEQ ID NO:5, but which
have substantially the same promoter activity as said sequence. In another embodiment, the
promoter may be a sequence comprising a large number of binding sites of AreA, a positive 
regulator of genes expressed during nitrogen limitation; these sites are referred to as nit-2 in 
Neurospora crassa (Fu and Marzluf, 1990, Proc. Natl. Acad. Sci. U.S.A. 87:5331-5335).
The promoter sequence may be modified by the addition or substitution of such AreA sites.

Terminators and polyadenylation sequences may also be derived from the same 
sources as the promoters. In a specific embodiment, the terminator sequence may be derived 
from a gene encoding a Fusarium oxysporum trypsin-like protease or a fragment thereof 
having substantially the same terminator activity as said sequence. The sequence of the
terminator is shown in SEQ ID NO:6. The invention further encompasses nucleic acid
sequences which hybridize to the terminator sequence shown in SEQ ID NO:6 under the
following conditions: presoaking in 5X SSC and prehybridizing for 1 hr. at about 40°C in a
solution of 20% formamide, 5X Denhardt's solution, 50 mM sodium phosphate, pH 6.8, and
50 µg denatured sonicated calf thymus DNA, followed by hybridization in the same solution
supplemented with 100 µM ATP for 18 hrs. at about 40°C, followed by a wash in 0.4X SSC
at a temperature of about 45°C, or which have at least about 90% homology and preferably
about 95% homology to SEQ ID NO:5, but which have substantially the same terminator
activity as said sequence.
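
As an illustration only, the percentage criterion above can be sketched in Python. The specification does not state how homology is computed; this sketch assumes it means percent identity over two sequences that have already been aligned to equal length, with gap columns ignored. The function names and the short example sequences are hypothetical and are not taken from SEQ ID NO:5 or SEQ ID NO:6.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where neither sequence has a gap.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the pair reaches the stated threshold (90% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-base example: one mismatch gives 90% identity.
example = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

A real comparison would first require an alignment step, which is outside this sketch.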

Enhancer sequences may also be inserted into the construct.

To avoid the necessity of disrupting the cell to obtain the expressed product,
and to minimize the amount of possible degradation of the expressed product within the cell, it
is preferred that the product be secreted outside the cell. To this end, in a preferred
embodiment, the gene of interest is linked to a preregion such as a signal or leader peptide
which can direct the expressed product into the cell's secretory pathway. The preregion may
be derived from genes for any secreted protein from any organism, or may be the native
preregion. Among useful available sources for such a preregion are a glucoamylase or an
amylase gene from an Aspergillus species, an amylase gene from a Bacillus species, a lipase or
proteinase gene from Rhizomucor miehei, the gene for the α-factor from Saccharomyces
cerevisiae, or the calf prochymosin gene. The preregion may be derived from the gene for A.
oryzae TAKA amylase, A. niger neutral α-amylase, A. niger acid stable α-amylase, B.
licheniformis α-amylase, the maltogenic amylase from Bacillus NCIB 11837, B.
stearothermophilus α-amylase, or B. licheniformis subtilisin. Effective signal sequences are
the A. oryzae TAKA amylase signal, the Rhizomucor miehei aspartic proteinase signal and the
Rhizomucor miehei lipase signal. As an alternative, the preregion native to the gene being
expressed may also be used, e.g., in SEQ ID NO:4 between amino acids -24 and -5.
The gene for the desired product functionally linked to promoter and terminator
sequences may be incorporated in a vector containing the selection marker or may be placed on 
a separate vector or plasmid capable of being integrated into the genome of the host strain. 
Alternatively, the vectors used may be capable of replicating as linear or circular 
extrachromosomal elements in the host cell. These types of vectors include, for example,
plasmids and minichromosomes. The vector system may be a single vector or plasmid or two
or more vectors or plasmids which together contain the total DNA to be integrated into the 
genome. Vectors or plasmids may be linear or closed circular molecules. 

The host cell may be transformed with the nucleic acid encoding the 
heterologous protein using procedures known in the art such as transformation and 
electroporation (see, for example, Fincham, 1989, Microbiol. Rev. 53:148-170).

The recombinant host cell of the present invention may be cultured using 
procedures known in the art. Briefly, the host cells are cultured on standard growth medium
such as those containing a combination of inorganic salts, vitamins, a suitable organic carbon
source such as glucose or starch, and any of a variety of complex nutrient sources (yeast extract,
hydrolyzed casein, soya bean meal, etc.). One example is FP-1 medium (5% soya bean meal,
5% glucose, 2% K2HPO4, 0.2% CaCl2, 0.2% MgSO4·7H2O and 0.1% pluronic acid
(BASF)). The fermentation is carried out at a pH of about 4.5-8.0, and at a temperature of
about 20-37°C for about 2-7 days.
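
For convenience, the FP-1 composition given above can be converted into weighed amounts for a chosen batch volume. The following Python sketch assumes that the percentages are weight per volume (grams per 100 ml), which the text does not state explicitly; the dictionary and function names are illustrative only.

```python
# FP-1 components as listed in the text, in percent.
# Assumption: percent means w/v (grams per 100 ml of medium).
FP1_PERCENT_WV = {
    "soya bean meal": 5.0,
    "glucose": 5.0,
    "K2HPO4": 2.0,
    "CaCl2": 0.2,
    "MgSO4.7H2O": 0.2,
    "pluronic acid": 0.1,
}


def grams_for_volume(percent_wv: float, volume_ml: float) -> float:
    """Grams of one component needed for volume_ml of medium."""
    return percent_wv / 100.0 * volume_ml


def fp1_recipe(volume_ml: float) -> dict:
    """Grams of every FP-1 component for the requested batch volume."""
    return {
        name: round(grams_for_volume(pct, volume_ml), 3)
        for name, pct in FP1_PERCENT_WV.items()
    }


if __name__ == "__main__":
    for name, grams in fp1_recipe(1000).items():
        print(f"{name}: {grams} g per litre")
```

Under the stated w/v assumption, one litre would call for 50 g each of soya bean meal and glucose.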

The present host cell species can be used to express any prokaryotic or 
eukaryotic heterologous protein of interest, and is preferably used to express eukaryotic 
proteins. Of particular interest for these species is their use in expression of heterologous 
proteins, especially fungal enzymes. The novel expression systems can be used to express 
enzymes such as catalase, laccase, phenoloxidase, oxidase, oxidoreductases, cellulase,
xylanase, peroxidase, lipase, hydrolase, esterase, cutinase, protease and other proteolytic
enzymes, aminopeptidase, carboxylase, phytase, lyase, pectinase and other pectinolytic
enzymes, amylase, glucoamylase, α-galactosidase, β-galactosidase, α-glucosidase, β-glucosidase,
mannosidase, isomerase, invertase, transferase, ribonuclease, chitinase, mutanase
and deoxyribonuclease. 

In a specific embodiment, the enzyme is an alkaline protease, e.g., a Fusarium 
oxysporum pre-pro-trypsin gene. In a most specific embodiment, the genomic sequence is 
shown in SEQ ID NO:3 and the protein sequence is shown in SEQ ID NO:4. 

In another specific embodiment, the enzyme is an alkaline endoglucanase 
which is immunologically reactive with an antibody raised against a highly purified ~43 kD
endoglucanase derived from Humicola insolens, DSM 1800, or which is a derivative of the ~43
kD endoglucanase exhibiting cellulase activity (cf. WO 91/17243). The endoglucanase,
hereinafter referred to as "Carezyme®", may be encoded by a gene shown in SEQ ID NO:7 and
may have a protein sequence shown in SEQ ID NO:8. The enzyme may also be a Carezyme®
variant.

In yet another specific embodiment, the enzyme is a 1,3-specific lipase, 
hereinafter referred to as Lipolase®. The enzyme may be encoded by the DNA sequence
shown in SEQ ID NO:9 and may have an amino acid sequence shown in SEQ ID NO:10. The
enzyme may also be a Lipolase® variant, e.g., D96L, E210K, E210L (see WO 92/05249). 

It will be understood by those skilled in the art that the term "fungal enzymes" 
includes not only native fungal enzymes, but also those fungal enzymes which have been 
modified by amino acid substitutions, deletions, additions, or other modifications which may
be made to enhance activity, thermostability, pH tolerance and the like. The present host cell 
species can also be used to express heterologous proteins of pharmaceutical interest such as 
hormones, growth factors, receptors, and the like. 

The present invention will be further illustrated by the following non-limiting
examples.

6. EXAMPLES 

6.1. Fusarium graminearum 20334 Secretes Only a Low Level of
Protein

Conidial spore suspensions of Fusarium graminearum strain 20334,
A. oryzae, and A. niger are inoculated into 25 ml of YPD medium (1% yeast extract (Difco),
2% bactopeptone (Difco), 2% glucose) in a 125 ml shake flask and incubated at 30°C at 300
rpm for 5 days. Supernatant broths from the cultures are harvested by centrifugation. A total of
10 µl of each sample are mixed with 10 µl 0.1 M dithiothreitol (Sigma) and 10 µl of loading
buffer (40 mM Tris base, 6% sodium dodecyl sulfate, 2.5 mM EDTA, 15% glycerol, 2 mg/ml
bromocresol purple). The samples are boiled for 5 minutes and run on a 4-12% polyacrylamide
gel (Novex). The proteins are visualized by staining with Coomassie Blue. The results (Figure
1) show that Fusarium graminearum strain 20334 produces very little secreted protein. 

6.2. Fusarium graminearum 20334 Secretes Only a Low Level of 
Proteases 

A total of 40 µl of culture broths from Fusarium graminearum strain 20334,
A. oryzae, and A. niger (see Section 6.1., supra) are each pipetted into wells that are cut into a
casein agar plate (2% non-fat dry milk (Lucerne), 50 mM Tris-HCl pH=7.5, 1% noble agar
(Difco)). The plates are incubated at 37°C for 5 hours and the zones of protein hydrolysis are
observed. The results (Figure 2) show that Fusarium graminearum strain 20334 broth contains
very little proteolytic activity.

6.3. Cloning of Fusarium oxysporum Genomic Prepro-trypsin Gene 

A genomic DNA library in lambda phage is prepared from the F. oxysporum 
genomic DNA using methods such as those described in Sambrook et al., 1989,
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor, NY. A total of 50 µg genomic
DNA are digested in a volume of 200 µl containing 10 mM Tris (pH=7.5), 50 mM NaCl,
7 mM MgCl2, 7 mM 2-mercaptoethanol, and 4 units restriction enzyme Sau3A for one minute
at 37°C. Partially digested DNA of molecular size 10-20 kb is isolated by agarose gel
electrophoresis, followed by electroelution into dialysis membrane and concentration using an
Elutip-D column (Schleicher and Schuell). One µg of lambda arms of phage of EMBL4 that
had been cut with restriction enzyme BamHI and treated with phosphatase (Clontech) is
ligated with 300-400 µg Sau3A cut genomic DNA in a volume of 25 µl under standard
conditions (see Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor, NY). Lambda phage are prepared from this ligation mix using a commercially
available kit (Gigapack Gold II, Stratagene) following the manufacturer's directions.

The plating of ca. 15,000 recombinant lambda phage and the production of filter 
lifts (to Hybond N+ filters, Amersham) are performed using standard methods (Sambrook et
al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor, NY). The filters are
processed for hybridization with a Genius Kit for nonradioactive nucleic acids detection
(Boehringer Mannheim) using standard methods (Sambrook et al., 1989, Molecular Cloning,
A Laboratory Manual, Cold Spring Harbor, NY). The DNA used as a probe is a 0.75 kb
digoxigenin (DIG) labeled PCR fragment of the entire coding region of the F. oxysporum
trypsin-like protease (hereinafter referred to as SP387) gene present in plasmid pSX233, which
has been deposited with the NRRL under the accession number of NRRL B-21241. The
primers for the PCR reaction are 5'-tgcggatccATGGTCAAGTTCGCTTCCGTC (forward
primer, SEQ ID NO:1) and 5'-gacctcgagTTAAGCATAGGTGTCAATGAA (reverse primer,
SEQ ID NO:2). In both primers, the lower case characters represent linker sequences and the
upper case characters correspond to the coding region of the SP387 gene. To perform the
PCR, 25 ng of a 907 bp BamHI/XbaI DNA fragment containing the SP387 gene from
plasmid pSX233 are mixed with 68 pmoles of each forward and reverse primer.
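
The relationship between the two primers and the coding region can be checked with a short Python sketch. It assumes the forward primer reads ATGGTCAAG... at the start of its upper case portion and uses the standard genetic code; the helper names are illustrative and do not come from the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

FORWARD = "tgcggatccATGGTCAAGTTCGCTTCCGTC"  # SEQ ID NO:1 as read here
REVERSE = "gacctcgagTTAAGCATAGGTGTCAATGAA"  # SEQ ID NO:2


def coding_part(primer: str) -> str:
    """Upper case portion of a primer (the part matching the SP387 gene)."""
    return "".join(ch for ch in primer if ch.isupper())


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons of an upper case DNA string ('*' = stop)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Forward primer covers the first codons of the open reading frame.
n_terminus = translate(coding_part(FORWARD))
# Reverse primer anneals to the sense strand, so translate its reverse complement.
c_terminus = translate(reverse_complement(coding_part(REVERSE)))
```

Under these assumptions the forward primer begins at an ATG start codon and the reverse primer ends on a stop codon, which is consistent with amplification of the entire coding region.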

The mixture of the DNA fragment and primers is made up to an 80 µl volume in
1X Taq Buffer/1X DIG labelling Mix/5 units Taq (Boehringer Mannheim). The reaction
conditions are 95°C, 3 minutes, then 35 cycles of [95°C 30 seconds, 50°C 1 minute, 72°C 1
minute]. The DNA sequence derived by PCR from the F. oxysporum trypsin-like protease is
shown in SEQ ID NO:3. The phage plaques are screened with the DIG labeled probe using a
modification (Engler and Blum, 1993, Anal. Biochem. 210:235-244) of the Genius kit
(Boehringer Mannheim). Positive clones are isolated and purified by a second round of plating
and hybridization. Recombinant lambda phage containing the F. oxysporum trypsin-like
protease gene are prepared and DNA is isolated from the phage using a Qiagen lambda midi
preparation kit (Qiagen).

6.4. Construction of Expression Plasmid pJRoy6 

Restriction mapping, Southern blotting, and hybridization techniques 
(Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor, NY) 
are used to identify a 5.5 kb PstI restriction enzyme fragment from one of the recombinant
phage that contains the F. oxysporum trypsin-like protease coding gene and flanking DNA
sequences. This 5.5 kb PstI fragment is subcloned into PstI digested pUC118 and the plasmid
is designated pJRoy4 (see Figure 3). Plasmid pJRoy4 is digested with restriction enzyme
EcoRI and a 3.5 kb EcoRI fragment containing the SP387 gene and the 43 bp EcoRI/PstI
region of the pUC118 polylinker is isolated and subcloned into the vector pToC90 to create
plasmid pJRoy6 (Figure 3). 

6.5. Construction of SP387 Expression Cassette 

An expression cassette (pJRoy20) containing the SP387 promoter and 
terminator joined by a BamHI site in pUC118 is constructed. An E. coli strain containing
pJRoy20 has been deposited with the NRRL. The promoter fragment is generated by
digesting the SP387 vector pJRoy6 with EcoRI (which cuts at -1200) and with NcoI (which
cuts at the translational start site, see Figure 5). The terminator sequence (bp 2056-3107 in 
Figure 5) is generated by PCR amplification using the following oligonucleotides: 

FORWARD

5' gcacaccatggtcgctggatccATACCTTGTTGGAAGCGTCG 3' (SEQ ID NO:11)

REVERSE

5' atcggagcatgcggtaccgtttaaacgaattcAGGTAAACAAGATATAATTTTCTG 3' (SEQ ID NO:12)

Letters in upper case are complementary to SP387 terminator DNA, while lower case letters are
tails containing engineered restriction sites.
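
The engineered sites in the lower case tails can be located with a simple string search. The Python sketch below is illustrative only: the recognition sequences are the commonly published ones for these enzymes, and the dictionary and function names are hypothetical.

```python
# Commonly published recognition sequences for the enzymes named in the text.
RECOGNITION_SITES = {
    "NcoI": "CCATGG",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "PmeI": "GTTTAAAC",
    "KpnI": "GGTACC",
    "SphI": "GCATGC",
}

FORWARD = "gcacaccatggtcgctggatccATACCTTGTTGGAAGCGTCG"                # SEQ ID NO:11
REVERSE = "atcggagcatgcggtaccgtttaaacgaattcAGGTAAACAAGATATAATTTTCTG"  # SEQ ID NO:12


def tail(primer: str) -> str:
    """Lower case 5' tail of a primer, returned in upper case."""
    return "".join(ch for ch in primer if ch.islower()).upper()


def sites_in_order(seq: str) -> list:
    """Names of the enzymes whose sites occur in seq, ordered 5' to 3'."""
    hits = []
    for enzyme, site in RECOGNITION_SITES.items():
        position = seq.find(site)
        if position != -1:
            hits.append((position, enzyme))
    return [enzyme for _, enzyme in sorted(hits)]


forward_sites = sites_in_order(tail(FORWARD))
reverse_sites = sites_in_order(tail(REVERSE))
# The reverse primer anneals to the opposite strand, so on the amplified
# product its sites appear in reversed order at the 3' end.
three_prime_end_sites = list(reversed(reverse_sites))
```

The search only reports the first occurrence of each site, which is sufficient for tails of this length.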

After digestion with NcoI and SphI, the resulting amplification product
containing the terminator flanked by NcoI and BamHI sites on the 5' end, and flanked by
EcoRI, PmeI, KpnI and SphI sites on the 3' end is isolated. A 3-way ligation between the
promoter fragment, the terminator fragment and KpnI/SphI cut pUC118 is performed to
generate pJRoy20 (see Figure 5).

6.6. Carezyme® Constructs 

The EcoRV site at -15 in the SP387 promoter, and the NcoI site present at
+243 in the Carezyme® coding region are utilized to create an exact fusion between the SP387
promoter and the Carezyme® gene. A PCR fragment containing -18 to -1 of the SP387
promoter directly followed by -1 to +294 of the Carezyme® gene is generated from the
Carezyme® vector pCaHj418 (see Figure 10) using the following primers:

FORWARD 

EcoRV 

5'c^gatatctatctcttcaccATGCGTTCCTCCCCCCTC (SEQ ID NO: 13) 
REVERSE 

5'CAATAGAGGTGGCAGCAAAA 3' (SEQ ID NO:14) 

Lower case letters in the forward primer are bp -24 to -1 of the SP387 promoter, while upper
case letters are bp 1 to 20 of Carezyme®. 

The PCR conditions used are: 95°C, 5 min. followed by 30 cycles of
[95°C, 30 sec., 50°C, 1 min., 72°C, 1 min.]. The resulting 0.32 kb fragment is cloned into
vector pCRII using Invitrogen's TA cloning kit, resulting in pDM148 (see Figure 11). The
0.26 kb EcoRV/NcoI fragment is isolated from pDM148 and ligated to the 0.69 kb NcoI/BglII
fragment from pCaHj418 and cloned into EcoRV/BamHI digested pJRoy20 to create pDM149
(see Figure 12). The 3.2 kb EcoRI Carezyme® expression cassette (SP387
promoter/Carezyme®/SP387 terminator) is isolated from pDM149 and cloned into the EcoRI
site of pToC90 to create pDM151 (see Figure 6). Expression construct pDM151 contains both
the expression cassette and the amdS selectable marker. An E. coli strain containing pDM151 
has been deposited with the NRRL. 

6.7. Lipolase® Constructs 

The EcoRV site at -15 in the SP387 promoter, and the SacI site at +6 in the
Lipolase® coding region are utilized to create an exact fusion between the SP387 promoter and 
the Lipolase® gene. An adapter containing the final 15 bp of the SP387 promoter followed by 
the first 6 bp of the Lipolase® coding region is constructed and is shown below. 

EcoRV                SacI

atctatctcttcaccATGAGGAGCT (SEQ ID NO:15)
tagatagagaagtggTACTCC     (SEQ ID NO:16)
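
As a consistency check on the adapter, the following Python sketch confirms that the lower strand pairs base for base with the upper strand and reports the unpaired bases left at the SacI end. It assumes the lower strand is written 3' to 5' beneath the upper strand, as laid out above, and that the upper strand begins atctatctc; the function names are illustrative.

```python
TOP = "atctatctcttcaccATGAGGAGCT"   # SEQ ID NO:15, written 5' to 3'
BOTTOM = "tagatagagaagtggTACTCC"    # SEQ ID NO:16, assumed written 3' to 5'

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Base-by-base complement, preserving case and orientation."""
    return seq.translate(_COMPLEMENT)


def duplex_and_overhang(top: str, bottom: str) -> tuple:
    """Return (paired region of top, unpaired 3' extension of top)."""
    paired = top[:len(bottom)]
    if complement(paired) != bottom:
        raise ValueError("strands are not complementary as written")
    return paired, top[len(bottom):]


paired, overhang = duplex_and_overhang(TOP, BOTTOM)
promoter_part = "".join(ch for ch in paired if ch.islower())  # SP387 promoter bases
coding_part = "".join(ch for ch in paired if ch.isupper())    # Lipolase coding bases
```

With the strands as read here, the duplex carries 15 promoter bases and 6 coding bases, and the upper strand extends by four bases at the SacI end.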

A 0.9 kb SacI/BamHI fragment of the Lipolase® cDNA gene is isolated from the A. oryzae
expression construct pMHan37 (see Figure 13). The EcoRV/SacI adapter and SacI/BamHI
Lipolase® fragment are ligated and cloned into EcoRV/BamHI digested pJRoy20 to create
plasmid pDM154 (see Figure 14). The 3.2 kb KpnI Lipolase® expression cassette (SP387
promoter/Lipolase®/SP387 terminator) is isolated from pDM154 and cloned into the KpnI site
of pToC90 to create plasmid pDM155 (see Figure 7). Expression construct pDM155 contains
both the Lipolase® expression cassette and the amdS selectable marker. An E. coli strain 
containing pDM155 has been deposited with the NRRL.

6.8. Transformation of F. graminearum 

Fusarium graminearum strain ATCC 20334 cultures are grown on 100 x 15 mm 
petri plates of Vogel's medium (Vogel, 1964, Am. Nature 98:435-446) plus 1.5% glucose and 
1.5% agar for 3 weeks at 25°C. Conidia (approximately 10^8 per plate) are dislodged in 10 ml 
of sterile water using a transfer loop and purified by filtration through 4 layers of cheesecloth 
and finally through one layer of miracloth. Conidial suspensions are concentrated by 
centrifugation. Fifty ml of YPG (1% yeast extract (Difco), 2% bactopeptone (Difco), 
2% glucose) are inoculated with 10^8 conidia and incubated for 14 h at 20°C, 150 rpm. 
Resulting hyphae are trapped on a sterile 0.4 µm filter and washed successively with sterile 
distilled water and 1.0 M MgSO4. The hyphae are resuspended in 10 ml of Novozym® 234 
(Novo Nordisk) solution (2-10 mg/ml in 1.0 M MgSO4) and digested for 15-30 min at 34°C 
with agitation at 80 rpm. Undigested hyphal material is removed from the resulting protoplast 
suspension by successive filtration through 4 layers of cheesecloth and through miracloth. 
Twenty ml of 1 M sorbitol are passed through the cheesecloth and miracloth and combined with 
the protoplast solution. After mixing, protoplasts (approximately 5 x 10^8) are pelleted by 
centrifugation and washed successively by resuspension and centrifugation in 20 ml of 1 M 
sorbitol and in 20 ml of STC (0.8 M sorbitol, 50 mM Tris-HCl pH=8.0, 50 mM CaCl2). The 
washed protoplasts are resuspended in 4 parts STC and 1 part SPTC (0.8 M sorbitol, 
40% polyethylene glycol 4000 (BDH), 50 mM Tris-HCl pH=8.0, 50 mM CaCl2) at a 
concentration of 1-2 x 10^8/ml. One hundred µl of protoplast suspension are added to 5 µg 
pJRoy6 and 5 µl heparin (5 mg/ml in STC) in polypropylene tubes (17 x 100 mm) and 
incubated on ice for 30 min. One ml of SPTC is mixed gently into the protoplast suspension 
and incubation is continued at room temperature for 20 min. Protoplasts are plated on a 
selective medium consisting of Cove salts (Cove, D.J., 1966, Biochem. Biophys. 
Acta 113:51-56) plus 10 mM acetamide, 15 mM CsCl, 2.5% noble agar (Difco) and 1.0 M 
sucrose using an overlay of the same medium with 0.6 M sucrose and 1.0% low melting 
agarose (Sigma). Plates are incubated at 25°C and transformants appear in 6-21 days. 
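
The mixing arithmetic in the protocol above can be sketched as follows. This is an illustrative calculation only, assuming that STC contains no polyethylene glycol, that volumes are additive, and that the protoplast density is 1-2 x 10^8 per ml; the function names are hypothetical.

```python
# Sketch: PEG content of the 4:1 STC/SPTC resuspension mix and the number of
# protoplasts delivered per transformation tube.

def mixed_concentration(parts_a: float, conc_a: float,
                        parts_b: float, conc_b: float) -> float:
    """Concentration after combining two solutions in the given volume ratio."""
    return (parts_a * conc_a + parts_b * conc_b) / (parts_a + parts_b)


def protoplasts_per_tube(density_per_ml: float, volume_ul: float) -> float:
    """Number of protoplasts contained in volume_ul of suspension."""
    return density_per_ml * volume_ul / 1000.0


# 4 parts STC (0% PEG, assumed) plus 1 part SPTC (40% PEG 4000)
peg_percent = mixed_concentration(4, 0.0, 1, 40.0)

low = protoplasts_per_tube(1e8, 100)    # lower end of the stated density
high = protoplasts_per_tube(2e8, 100)   # upper end of the stated density

print(f"PEG in resuspension mix: {peg_percent:.1f}%")
print(f"protoplasts per 100 ul: {low:.0e} to {high:.0e}")
```

Under these assumptions each tube receives on the order of 10^7 protoplasts in about 8% PEG before the 1 ml of SPTC is added.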

6.9. Expression of trypsin-like protease in Fusarium graminearum 

Transformants are transferred to plates of COVE2 medium (same as COVE 
medium above without the cesium chloride and replacing the 1.0 M sucrose with a 
concentration of 30 g/l) and grown for 3 or more days at 25°C. Twenty-five ml aliquots of 
FP-1 medium (5% soya bean meal, 5% glucose, 2% K2HPO4, 0.2% CaCl2, 0.2% MgSO4·7H2O 
and 0.1% pluronic acid (BASF)) in 150 ml flasks are inoculated with approximately 1 cm agar 
plugs from COVE2 plate cultures and incubated for 6 days at 30°C with agitation (150 rpm). 
Supernatant broth samples are recovered after centrifugation and subjected to SDS-PAGE 
analysis as follows. Thirty µl of each broth is mixed with 10 µl SDS-PAGE sample buffer 
(1 ml 0.5 M Tris pH=6.8, 0.8 ml glycerol, 1.6 ml 10% SDS, 0.4 ml 0.8 M dithiothreitol, 
0.2 ml 1% bromophenol blue), 2 µl of 2% PMSF (Sigma) in isopropanol, and 2 µl glycerol. 
The samples are placed in a boiling water bath for 4 minutes and 40 µl of each are run on a 
10-27% polyacrylamide gel (Novex). The gels are stained and destained with Coomassie dye 
using standard methods. The expression level of the trypsin-like protease has been determined 
to be >0.5 g/l. 

6.10. Enzyme assays 
6.10.1. Carezyme® 

Buffer: Sodium phosphate (50 mM, pH 7.0) 

Substrate: AZCL-HE cellulose (Megazyme) at 2 mg/ml buffer 

Enzyme std: 100 mg of Carezyme® standard (10,070 ECU/g) is dissolved in 1 
ml buffer and stored at -20°C. This stock is diluted 1:100 in buffer immediately prior to use in 
enzyme assays. The assay range is 0.5 - 5.0 ECU/ml. A conversion factor of 650,000 ECU/g 
Carezyme® is used. 
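
As an illustration of how the figures above combine, the sketch below converts a measured activity (ECU/ml) into an enzyme concentration (mg/l) using the stated factor of 650,000 ECU/g, and computes the activity of the diluted standard. It is a worked example under the assumption that the conversion is a simple proportionality; the function names are hypothetical.

```python
# Sketch: activity-to-mass conversion for Carezyme using the stated factor.
ECU_PER_GRAM = 650_000  # conversion factor given in the text


def ecu_per_ml_to_mg_per_l(ecu_per_ml: float) -> float:
    """Convert activity in ECU/ml to enzyme protein in mg/l."""
    grams_per_ml = ecu_per_ml / ECU_PER_GRAM
    return grams_per_ml * 1000.0 * 1000.0  # g/ml -> mg/ml -> mg/l


def diluted_standard_ecu_per_ml(mass_mg: float = 100.0,
                                activity_ecu_per_g: float = 10_070.0,
                                volume_ml: float = 1.0,
                                dilution: float = 100.0) -> float:
    """Activity of the standard stock after the 1:100 dilution."""
    stock = (mass_mg / 1000.0) * activity_ecu_per_g / volume_ml
    return stock / dilution


print(f"47.3 ECU/ml -> {ecu_per_ml_to_mg_per_l(47.3):.0f} mg/l")
print(f"diluted standard: {diluted_standard_ecu_per_ml():.2f} ECU/ml")
```

For example, an activity of 47.3 ECU/ml corresponds to roughly 73 mg/l on this basis.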

Substrate solution (990 µl) is added to sample wells of a 24-well microtiter 
plate. Ten µl of Carezyme® sample (diluted in buffer to produce activity of between 0.5 and 
10 ECU/ml) are added to the substrate. Reactions are incubated for 30 minutes at 45°C with 






WO 96/00787 PCT/US95/07743 

supernatant are transferred to a 96-well microtiter plate and the absorbance at 650 nm is 
measured. 

6.10.2. Lipolase® Assay 

Buffer: 0.1 M MOPS, pH 7.5, containing 4 mM CaCl2 

Substrate: 10 µl p-nitrophenyl butyrate (pNB) in 1 ml DMSO; add 4 ml buffer to substrate in DMSO. 
Stock concentration = 11.5 mM in 20% DMSO. 

Enzyme std: Lipolase® (23,100 LU/g) is dissolved at 1000 LU/ml in 50% glycerol and stored at -20°C. 
This stock is diluted 1:100 in buffer immediately prior to assay. The assay range is 
0.125 to 3.0 LU/ml. 

100 µl pNB stock solution is added to 100 µl of appropriately diluted enzyme 
sample. Activity (mOD/min) is measured at 405 nm for 5 min at 25°C. 
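
The stated stock concentration can be cross-checked with a short calculation. This sketch reads the substrate volume as 10 µl of neat pNB, and it relies on two reference values that are not given in the text and are therefore assumptions: a molecular weight of about 209.2 g/mol and a density of about 1.19 g/ml for p-nitrophenyl butyrate.

```python
# Sketch: cross-check of the pNB stock concentration and DMSO content.
PNB_MW_G_PER_MOL = 209.2      # assumed reference value, not from the text
PNB_DENSITY_G_PER_ML = 1.19   # assumed reference value, not from the text


def pnb_stock_mm(pnb_ul: float = 10.0, dmso_ml: float = 1.0,
                 buffer_ml: float = 4.0) -> float:
    """Millimolar pNB after dissolving in DMSO and adding buffer."""
    grams = pnb_ul / 1000.0 * PNB_DENSITY_G_PER_ML
    moles = grams / PNB_MW_G_PER_MOL
    total_l = (dmso_ml + buffer_ml) / 1000.0  # the 10 ul of pNB itself is neglected
    return moles / total_l * 1000.0


def dmso_percent(dmso_ml: float = 1.0, buffer_ml: float = 4.0) -> float:
    """Volume percent of DMSO in the finished stock."""
    return 100.0 * dmso_ml / (dmso_ml + buffer_ml)


stock = pnb_stock_mm()
print(f"pNB stock: {stock:.1f} mM in {dmso_percent():.0f}% DMSO")
print(f"in the well after 1:1 mixing with enzyme: {stock / 2:.1f} mM")
```

With these assumed constants the result is close to the 11.5 mM in 20% DMSO stated above.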

6.10.3. SP387 Assay 

L-BAPNA substrate is prepared by dilution of a 0.2 M stock solution of 
L-BAPNA (Sigma B3133) in dimethyl sulfoxide (stored frozen) to 0.004 M in buffer (0.01 M 
dimethylglutaric acid (Sigma), 0.2 M boric acid and 0.002 M calcium chloride, adjusted to pH 
6.5 with NaOH) just prior to use. One ml of culture is centrifuged (145000 x g, 10 min). A 
100 µl aliquot of diluted culture broth is added to 100 µl substrate in a 96 well microtiter plate. 
Absorption change at 405 nm is assayed at 30 second intervals for 5 min at 25°C using an 
ELISA reader. Results are calculated relative to a purified SP387 standard. 
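
One way to reduce such kinetic readings to a rate is a least-squares slope through the absorbance time course, expressed in mOD/min and then compared with the standard. The sketch below assumes a linear response and uses made-up readings purely for illustration; it is not the calculation used by any particular plate reader.

```python
# Sketch: rate (mOD/min) from absorbance readings taken every 30 s for 5 min.

def rate_mod_per_min(times_s, absorbances):
    """Least-squares slope of absorbance versus time, in mOD per minute."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    slope_per_s = sxy / sxx              # absorbance units per second
    return slope_per_s * 60.0 * 1000.0   # -> milli-absorbance units per minute


times = list(range(0, 301, 30))  # 0, 30, ..., 300 s (11 readings)

# Hypothetical readings: sample rises 0.020 A/min, standard rises 0.040 A/min
sample = [0.100 + 0.020 * t / 60 for t in times]
standard = [0.100 + 0.040 * t / 60 for t in times]

sample_rate = rate_mod_per_min(times, sample)
standard_rate = rate_mod_per_min(times, standard)
relative_activity = sample_rate / standard_rate

print(f"sample: {sample_rate:.1f} mOD/min, standard: {standard_rate:.1f} mOD/min")
print(f"activity relative to standard: {relative_activity:.2f}")
```

The relative activity can then be scaled by the known concentration of the standard, provided both rates fall within the linear range of the assay.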

6.11. Expression of Carezyme® 

Twenty-three transformants of pDM151 are purified, cultured in shake flasks 
on soy/glucose medium and assayed for Carezyme® activity after 9 days (Table 1, see below). 
Four transformants express Carezyme® at a level of approximately 50-100 mg/L. 
Transformant pDM151-4 is cultured in small scale fermentors using the conditions developed 
for SP387 production (see Section 6.9). Approximately 6.0 g/L of Carezyme® is evident after 
7 days (Figure 8A). Carezyme® comprised greater than 90% of secreted proteins based on 
SDS gel electrophoresis (Figure 8B). 

TABLE I 

Transformant #     ECU/ml     mg/l 
pDM151.3-4         53.2       90 
pDM151.3-5         0          0 
pDM151.3-6         0          0 
pDM151.3-10        0          0 
pDM151.3-11        2.48       4 
pDM151.3-12        0          0 
pDM151.3-13        12.2       19 
pDM151.3-14        47.3       73 
pDM151.3-15        22.7       35 
pDM151.3-16        0          0 
pDM151.3-17        0          0 
pDM151.3-18        0          0 
pDM151.3-19        0          0 
pDM151.3-21        0          0 
pDM151.3-22        43.7       87 
pDM151.3-23        1.25       2 
pDM151.3-24        17.8       27 
pDM151.3-25        36         58 
pDM151.3-26        0          0 
pDM151.3-27        10.5       16 
pDM151.3-28        49.3       78 
pDM151.3-29        19.8       30 
pDM151.3-30        22.7       35 


6.12. Expression of Lipolase® 

Fifteen transformants of pDM155 are purified, cultured in shake flasks in 
soy/glucose medium and assayed for Lipolase® activity after 9 days (Table 2, see below). 

TABLE II 

Transformant #     LU/ml      mg/l 
pDM155-1           669        167 
pDM155-2           45.2       11 
pDM155-3           180        45 
pDM155-4           0          0 
pDM155-5           55.4       14 
pDM155-6           116        29 
pDM155-7           704        176 
pDM155-8           214        54 
pDM155-9           17.1       4 
pDM155-10          712        173 
pDM155-11          511        128 
pDM155-12          0          0 
pDM155-13          0          0 
pDM155-14          0          0 
pDM155-15          153        38 
pDM155-16          0          0 
pDM155-17          0          0 
pDM155-18          0          0 
pDM155-19          129        32 
pDM155-20          378        95 
pDM155-21          216        54 


Four transformants expressed Lipolase® at a level of approximately 100-200 mg/l (based on 
the pNB assay). Transformant pDM155-10 is cultured in small scale fermentors using the 
conditions developed for SP387 production (see Section 6.9). Approximately 2.0 g/l of 
Lipolase® is evident after 7 days (Figure 8A). Lipolase® comprised greater than 90% of 
secreted proteins based on SDS gel electrophoresis (Figure 8B). 

7. DEPOSIT OF MICROORGANISMS 

The following biological materials have been deposited in the Agricultural 
Research Service Patent Culture Collection (NRRL), Northern Regional Research Center, 
1815 University Street, Peoria, Illinois, 61604, USA. 

Strain                        Accession No.     Deposit Date 
E. coli containing pJRoy6     NRRL B-21285      6/20/94 
E. coli containing            NRRL B-21418      3/10/95 
E. coli containing pDM151     NRRL B-21419      3/10/95 
E. coli containing pDM155     NRRL B-21420      3/10/95 

The strains have been deposited under conditions that assure that access to the 
culture will be available during the pendency of this patent application to one determined by the 
Commissioner of Patents and Trademarks to be entitled thereto under 37 C.F.R. §1.14 and 35 
U.S.C. §122 and under conditions of the Budapest Treaty. The deposit represents a 
biologically pure culture of each deposited strain. The deposit is available as required by 
foreign patent laws in countries wherein counterparts of the subject application, or its progeny, 
are filed. However, it should be understood that the availability of a deposit does not constitute 
a license to practice the subject invention in derogation of patent rights granted by governmental 
action. 

The invention described and claimed herein is not to be limited in scope by the 
specific embodiments herein disclosed, since these embodiments are intended as illustrations of 
several aspects of the invention. Any equivalent embodiments are intended to be within the 
scope of this invention. Indeed, various modifications of the invention in addition to those 
shown and described herein will become apparent to those skilled in the art from the foregoing 
description. Such modifications are also intended to fall within the scope of the appended 
claims. 

Various references are cited herein, the disclosures of which are incorporated by 
reference in their entireties. 
SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Novo Nordisk Biotech, Inc. 

(B) STREET: 1445 Drew Avenue, Ste. 105 

(C) CITY: Davis 

(D) STATE: California 

(E) COUNTRY: US 

(F) ZIP: 95616-4880 

(G) TELEPHONE: (916) 757-8100 

(H) TELEFAX: (916) 758-0317 

(ii) TITLE OF INVENTION: NON-TOXIC, NON-TOXIGENIC, NON-PATHOGENIC 
FUSARIUM EXPRESSION SYSTEM AND PROMOTERS AND TERMINATORS FOR USE THEREIN 

(iii) NUMBER OF SEQUENCES: 16 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Novo Nordisk of North America, Inc. 

(B) STREET: 405 Lexington Avenue, 64th Floor 

(C) CITY: New York 

(D) STATE: New York 

(E) COUNTRY: USA 

(F) ZIP: 10174-6401 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: to be assigned 

(B) FILING DATE: 15-June-1995 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/269,449 

(B) FILING DATE: 30-June-1994 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/404,678 

(B) FILING DATE: 15-March-1995 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Agris Dr., Cheryl H. 

(B) REGISTRATION NUMBER: 34,086 

(C) REFERENCE/DOCKET NUMBER: 4216.204-WO 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 212-867-0123 

(B) TELEFAX: 212-878-9655 



(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
TGCGGATCCA TGGTCAAGTT CGCTTCCGTC 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
GACCTCGAGT TAAGCATAGG TGTCAATGAA 

(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 
ATCATCAACC ACTCTTCACT CTTCAACTCT CCTCTCTTGG ATATCTATCT CTTCACCATC 
GTCAAGTTCG CTTCCGTCGT TGCACTTGTT GCTCCCCTCG CTCCTGCCGC TCCTCAGGAG 
ATCCCCAACA TTGTTGGTGG CACTTCTGCC AGCGCTGGCG ACTTTCCCTT CATCGTGAGC 
ATTAGCCGCA ACGGTGGCCC CTGGTGTGGA GGTTCTCTCC TCAACGCCAA CACCGTCTTC 
ACTGCTGCCC ACTGCGTTTC CGGATACGCT CAGAGCGGTT TCCAGATTCG TGCTCGCAGT 
CTGTCTCGCA CTTCTGGTGG TATTACCTCC TCGCTTTCCT CCGTCAGAGT TCACCCTAGC 
TACAGCGGAA ACAACAACGA TCTTGCTATT CTGAAGCTCT CTACTTCCAT CCCCTCCGGC 
GGAAACATCG GCTATGCTCG CCTGGCTGCT TCCGGCTCTG ACCCTCTCGC TGGATCTTCT 
GCCACTGTTG CTGGCTGGGG CGCTACCTCT GAGGGCGGCA GCTCTACTCC CGTCAACCTT 
CTGAAGGTTA CTGTCCCTAT CGTCTCTCGT GCTACCTCCC GAGCTCAGTA CGGCACCTCC 
GCCATCACCA ACCAGATGTT CTGTGCTGGT GTTTCTTCCG GTGGCAAGGA CTCTTCCCAG 
GGTGACAGCG GCGGCCCCAT CGTCGACAGC TCCAACACTC TTATCGGTGC TGTCTCTTCG 
GGTAACGGAT GTGCCCGACC CAACTACTCT GGTGTCTATG CCAGCGTTGG TCCTCTCCGC 
TCTTTCATTG ACACCTATGC TTAAATACCT TGTTGGAAGC GTCGAGATGT TCCTTCAATA 
TTCTCTAGCT TGAGTCTTGG ATACGAAACC TGTTTGAGAA ATAGGTTTCA ACGAGTTAAG 
AAGATATGAG TTGATTTCAG TTGGATCTTA GTCCTGGTTG CTCGTAATAG AGCAATCTAG 
ATAGCCCAAA TTGAATATGA AATTTGATGA AAATATTC 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 248 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..224 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: -24..0 

(D) OTHER INFORMATION: /product= "OTHER" 
/note= "Label=pre-propeptide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Val Lys Phe Ala Ser Val Val Ala Leu Val Ala Pro Leu Ala Ala 
                -20                 -15                 -10 

Ala Ala Pro Gln Glu Ile Pro Asn Ile Val Gly Gly Thr Ser Ala Ser 
            -5                  1               5 

Ala Gly Asp Phe Pro Phe Ile Val Ser Ile Ser Arg Asn Gly Gly Pro 
    10                  15                  20 

Trp Cys Gly Gly Ser Leu Leu Asn Ala Asn Thr Val Leu Thr Ala Ala 
25                  30                  35                  40 

His Cys Val Ser Gly Tyr Ala Gln Ser Gly Phe Gln Ile Arg Ala Gly 
                45                  50                  55 

Ser Leu Ser Arg Thr Ser Gly Gly Ile Thr Ser Ser Leu Ser Ser Val 
            60                  65                  70 

Arg Val His Pro Ser Tyr Ser Gly Asn Asn Asn Asp Leu Ala Ile Leu 
        75                  80                  85 

Lys Leu Ser Thr Ser Ile Pro Ser Gly Gly Asn Ile Gly Tyr Ala Arg 
    90                  95                  100 

Leu Ala Ala Ser Gly Ser Asp Pro Val Ala Gly Ser Ser Ala Thr Val 
105                 110                 115                 120 

Ala Gly Trp Gly Ala Thr Ser Glu Gly Gly Ser Ser Thr Pro Val Asn 
                125                 130                 135 

Leu Leu Lys Val Thr Val Pro Ile Val Ser Arg Ala Thr Cys Arg Ala 
            140                 145                 150 

Gln Tyr Gly Thr Ser Ala Ile Thr Asn Gln Met Phe Cys Ala Gly Val 
        155                 160                 165 

Ser Ser Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Ile 
    170                 175                 180 

Val Asp Ser Ser Asn Thr Leu Ile Gly Ala Val Ser Trp Gly Asn Gly 
185                 190                 195                 200 

Cys Ala Arg Pro Asn Tyr Ser Gly Val Tyr Ala Ser Val Gly Ala Leu 
                205                 210                 215 

Arg Ser Phe Ile Asp Thr Tyr Ala 
            220 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1206 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 
GAATTCTTAC AAACCTTCAA CAGTGGAGAC TTCCGACACG ACATATCGAT CCTTTGAAGA 
TACGGTGAGC GTCAGATCAT GAATTTCATA CATCCTCACG TCCTTCCTCT TTCAAACTAT 
GCAAAGTCCT TCTAGTACCT CCCAAAACTT GATTTACGCG CTCTCCAATC AAAAGTACCT 
TCCAAAAGTG ATCTACCTCA GCTCTAGATC AGGGCACCTA TTCGCAAAGA TCTACAAGCT 
GAACTAGTAA GCATAGCGGG AGAATATCCC ACATCATTCG AGAAGGCCTT CGTATTAGAC 
CTAGTGGGAT CGACAGAAAA GATAAGACGG AGATAGATCC TATGTTTGGA AGGTAGGGGA 
TGGAATAGGA TGCAACAGGT ATTGGCATAA GCGATGCAAT AGGTGCATCT AGAAACTAGG 
TGACAGACTG GCCACAGAGG TGTATCCTAT GCAGGTCGAT GCGTGCGTTA TCGCAGGGCT 
GCTATTGCGT GGTGGTGGCT ACAAAAGTTC TATGTGGTTT CCAGTTTCAG AATATTCGGC 
CATTGTGATT GATGGCGCAT GACCGAATTA TAGCAGTGAA CCCCGCCCAG AGTAGTAGTG 
CAGATGCGCT TTGATGCTTG GCGATTCCTC GGGCTAAATA ACTCCGGTTG GTCTGTAGAA 
TGCTGACGCG ATGATCCTTC GGCATTAATC GTAGATCTTG GGGGGGGATA AGCCGATCAA 
AGACACACTG TAGATCAGCT CTTCGATGAC TCTTACCAGC TTTATAATAA CATTCATCTT 
GAACGTCTTT TTCGTCCAGT GTTTACCTTT CGTCCTATTT ATCCGTCATA TCCACAGTGT 
TATTGGCGAT AGAGTTATCG ACTTTCCTCA TCGGGATACT GGCCCCTGCT GCCAAGGGCC 
TTATATGCCG ATCACTTTCA CGGGAGCATG ATAAGGTTAA TGCTTCTTCT GAATCCCGAA 
CTAGACTACG GAACAACGGA GCTTAGTACC AGAAAGGCAG GTACGCCTAT TCGCAAACTC 
CGAAGATACA ACCAAGCAAG CTTATCGCGG GATAGTAACC AGAGAGGCAG GTAAGAAGAC 
ACAACAACAT CCATAGCTAT GTAGATTCTC GAATATAAAA GGACCAAGAT GGACTATTCG 
AAGTAGTCTA TCATCAACCA CTCTTCACTC TTCAACTCTC CTCTCTTGGA TATCTATCTC 
TTCACC 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1188 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

TAAATACCTT GTTGGAAGCG TCGAGATGTT CCTTGAATAT TCTCTAGCTT GAGTCTTCGA 60 

TACGAAACCT GTTTGAGAAA TAGGTTTCAA CGAGTTAAGA AGATATCAGT TGATTTCAGT 120 

TGGATCTTAG TCCTGGTTGC TCGTAATAGA GCAATCTAGA TAGCCCAAAT TCAATATCAA 180 

ATTTGATGGA AATATTCATT TCGATAGAAG CAACGTGAAA TGTCTAGCAG GACGAAAAGT 240 

AGATCAAGGC TGTTATGTTC CCCGACCAAC CTACCTTGAT GTCAGTCTGC GAGTCGTCTG 300 

CAGTGACCCA GAATGATGGA TTGACTTGGA CATTTTCTGT CTATGAAGTA TTATGAACAT 360 

GAATATCGTT TCCTCATTAT CTATGTTGGC AGCCTAAAGT TTTACCATAT AGCTAGCAAT 420 

CAGTCAAGTA TCTCCGTATG AAGGGTTGTT AAGCCAGGAC GGTATCAGCG TTGAATATTT 480 

AAAGAATGAT ATGAGATAAT CAACATTGAC ATGATAAAAG AAAAGGGGAA ACAAATTCTC 540 

CATATAGTAA AGACTTCAGG TCGACCCCTC AATAGACATA TGCGAACCGA AAACCAACAG 600 

GATACAATTT ATAGATAAGT ATAACTACAG TTATCTGTCT GCCGAACAAA TACTCTTTTG 660 

TGAAACAAAT GAAGAGTACA TAAGCTACAG TTCCTCAGTA GGAACATCCT TTACAATAAC 720 

TCCCTTGACT TCCTTCAGCT TCTCAATAGC CTCCAAAGTC ATCGGTCTGC CATCAAGGCA 780 

CGTCAGCTCT GGTGTAGCAT ACAGCAGTGC CATACTTACG GAGGATAGGA AGTGGGAGGA 840 

ATCGTTCGTG TCTGCCTCCA AAAATCGACA CCAGTGTCCT TTTTGACGAT ACTGATATGG 900 

TGGTAAGCTT GGGAGTCTAT TGTTGACGTT GCATCACTTA CTTAAGCACG GTTTCATTCC 960 

TCTGCTGATA GTCCTCCAAC TTCTCGAAGT CGTAAACGAT GGCCTATAGT ATCTTATTGA 1020 

GAAATATGTC TTCTCAGAAA ATTATATCTT GTTTACCTTT CGGTCCGCCA TGGCTGCTAA 1080 

AACTGCTGGG AAATTCAAAA GCGCAGCACA AGCAGCAAGA GTGATGGGCA CAACGTGATA 1140 

TGTTGATAAA AGCATCAGTA TCGATAAGTT CCACTCAGAA ACCTCCAG 1188 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1060 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 10.. 924 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 73.. 924 

(ix) FEATURE: 

(A) NAME/KEY: sig_peptide 

(B) LOCATION: 10.. 72 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

GGATCCAAG ATG CGT TCC TCC CCC CTC CTC CCG TCC GCC GTT GTG GCC  48 
          Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala 
          -21 -20                 -15                 -10 

GCC CTG CCG GTG TTG GCC CTT GCC GCT GAT GGC AGG TCC ACC CGC TAC  96 
Ala Leu Pro Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr 
            -5                  1               5 

TGG GAC TGC TGC AAG CCT TCG TGC GGC TGG GCC AAG AAG GCT CCC GTG  144 
Trp Asp Cys Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val 
    10                  15                  20 

AAC CAG CCT GTC TTT TCC TGC AAC GCC AAC TTC CAG CGT ATC ACG GAC  192 
Asn Gln Pro Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp 
25                  30                  35                  40 

TTC GAC GCC AAG TCC GGC TGC GAG CCG GGC GGT GTC GCC TAC TCG TGC  240 
Phe Asp Ala Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys 
                45                  50                  55 

GCC GAC CAG ACC CCA TGG GCT GTG AAC GAC GAC TTC GCG CTC GGT TTT  288 
Ala Asp Gln Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe 
            60                  65                  70 

GCT GCC ACC TCT ATT GCC GGC AGC AAT GAG GCG GGC TGG TGC TGC GCC  336 
Ala Ala Thr Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala 
        75                  80                  85 

... ... ... ... ... ... ... ... ... CCT GTT GCT GGC AAG AAG ATG  384 
Cys Tyr Glu Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met 
    90                  95                  100 

... ... ... ... ... AGC ACT ... ... GAT CTT GGC AGC AAC CAC TTC  432 
Val Val Gln Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe 
105                 110                 115                 120 

GAT CTC AAC ATC CCC GGC GGC GGC GTC GGC ATC TTC GAC GGA TGC ACT  480 
Asp Leu Asn Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr 
                125                 130                 135 

CCC CAG TTC GGC GGT CTG CCC GGC CAG CGC TAC GGC GGC ATC TCG TCC  528 
Pro Gln Phe Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser 
            140                 145                 150 

CGC AAC GAG TGC GAT CGG TTC CCC GAC GCC CTC AAG CCC GGC TGC TAC  576 
Arg Asn Glu Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr 
        155                 160                 165 

TGG CGC TTC GAC TGG TTC AAG AAC GCC GAC AAT CCG AGC TTC AGC TTC  624 
Trp Arg Phe Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe 
    170                 175                 180 

CGT CAG GTC CAG TGC CCA GCC GAG CTC GTC GCT CGC ACC GGA TGC CGC  672 
Arg Gln Val Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg 
185                 190                 195                 200 

CGC AAC GAC GAC GGC AAC TTC CCT GCC GTC CAG ATC CCC TCC AGC AGC  720 
Arg Asn Asp Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser 
                205                 210                 215 

ACC AGC TCT CCG GTC AAC CAG CCT ACC AGC ACC AGC ACC ACG TCC ACC  768 
Thr Ser Ser Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr 
            220                 225                 230 

TCC ACC ACC TCG AGC CCG CCA GTC CAG CCT ACG ACT CCC AGC GGC TGC  816 
Ser Thr Thr Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys 
        235                 240                 245 

... ... ... ... ... ... ... ... ... GGC AAT GGC TGG AGC GGC TGC  864 
Thr Ala Glu Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys 
    250                 255                 260 

ACC ACC TGC GTC GCT GGC AGC ACT TGC ACG AAG ATT AAT GAC TGG TAC  912 
Thr Thr Cys Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr 
265                 270                 275                 280 

... ... ... ... TAGACGCAGG GCAGCTTGAG GGCCTTACTG GTGGCCGCAA  964 
His Gln Cys Leu 

CGAAATGACA CTCCCAATCA CTGTATTAGT TCTTGTACAT AATTTCGTCA TCCCTCCAGG  1024 
GATTGTCACA TAAATGCAAT GAGGAACAAT GAGTAC  1060 

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 305 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala Ala Leu Pro 
-21 -20                 -15                 -10 

Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr Trp Asp Cys 
-5                  1               5                   10 

Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val Asn Gln Pro 
            15                  20                  25 

Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp Phe Asp Ala 
        30                  35                  40 

Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys Ala Asp Gln 
    45                  50                  55 

Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe Ala Ala Thr 
60                  65                  70                  75 

Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala Cys Tyr Glu 
                80                  85                  90 

Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met Val Val Gln 
            95                  100                 105 

Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe Asp Leu Asn 
        110                 115                 120 

Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Pro Gln Phe 
    125                 130                 135 

Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser Arg Asn Glu 
140                 145                 150                 155 

Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr Trp Arg Phe 
                160                 165                 170 

Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe Arg Gln Val 
            175                 180                 185 

Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg Arg Asn Asp 
        190                 195                 200 

Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser 
    205                 210                 215 

Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr 
220                 225                 230                 235 

Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu 
                240                 245                 250 

Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys 
            255                 260                 265 

Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys 
        270                 275                 280 

Leu 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 876 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATGAGGAGCT CCCTTGTGCT GTTCTTTGTC TCTGCGTCGA CGGCCTTGGC CAGTCCTATT 60 

CGTCGAGAGG TCTCGCAGGA TCTGTTTAAC CAGTTCAATC TCTTTGCACA GTATTCTGCA 120 

GCCGCATACT GCGGAAAAAA CAATGATGCC CCAGCTGGTA CAAACATTAC GTGCACGGGA 180 

AATGCCTGCC CCGAGGTAGA GAAGGCGGAT GCAACGTTTC TCTACTCGTT TGAAGACTCT 240 

GGAGTGGGCG ATGTCACCGG CTTCCTTGCT CTCGACAACA CGAACAAATT GATCGTCCTC 300 

TCTTTCCGTG GCTCTCGTTC CATAGAGAAC TGGATCGGGA ATCTTAACTT CGACTTGAAA 360 

GAAATAAATG ACATTTGCTC CGGCTGCAGG GGACATGACG GCTTCACTTC GTCCTGGAGG 420 

TCTGTAGCCG ATACGTTAAG GCAGAAGGTG GAGGATGCTG TCAGGGAGCA TCCCGACTAT 480 

CGCGTGGTGT TTACCGGACA TAGCTTGGGT GGTGCATTGG CAACTGTTGC CGGAGCAGAC 540 

CTGCGTGGAA ATGGGTATGA TATCGACGTG TTTTCATATC GCGCCCCCCG AGTCGGAAAC 600 

AGGGCTTTTG CAGAATTCCT GACCGTACAG ACCGGCGGAA CACTCTACCG CATTACCCAC 660 

ACCAATGATA TTGTCCCTAG ACTCCCGCCG CGCGAATTCG GTTACAGCCA TTCTAGCCCA 720 

GAGTACTGGA TCAAATCTGG AACCCTTGTC CCCGTCACCC GAAACGATAT CGTCAAGATA 780 

GAAGGCATCG ATGCCACCGG CGGCAATAAC CAGCCTAACA TTCCGGATAT CCCTCCGCAC 840 
CTATGGTACT TCGGGTTAAT TGGGACATGT CTTTAG 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 291 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu 
1               5                   10                  15 

Ala Ser Pro Ile Arg Arg Glu Val Ser Gln Asp Leu Phe Asn Gln Phe 
            20                  25                  30 

Asn Leu Phe Ala Gln Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn 
        35                  40                  45 

Asp Ala Pro Ala Gly Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro 
    50                  55                  60 

Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser 
65                  70                  75                  80 

Gly Val Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys 
                85                  90                  95 

Leu Ile Val Leu Ser Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile 
            100                 105                 110 

Gly Asn Leu Asn Phe Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly 
        115                 120                 125 

Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp 
    130                 135                 140 

Thr Leu Arg Gln Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr 
145                 150                 155                 160 

Arg Val Val Phe Thr Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val 
                165                 170                 175 

Ala Gly Ala Asp Leu Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser 
            180                 185                 190 

Tyr Gly Ala Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr 
        195                 200                 205 

Val Gln Thr Gly Gly Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile 
    210                 215                 220 

Val Pro Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro 
225                 230                 235                 240 

Glu Tyr Trp Ile Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp 
                245                 250                 255 

Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro 
            260                 265                 270 

Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly 
        275                 280                 285 

Thr Cys Leu 
    290 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GCACACCATG GTCGCTGGAT CCATACCTTG TTGGAAGCGT CG 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
ATCGGAGCAT GCGGTACCGT TTAAACGAAT TCAGGTAAAC AAGATATAAT TTTCTG 56 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CTCTTGGATA TCTATCTCTT CACCATGCGT TCCTCCCCCC TCCT 44 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CAATAGAGGT GGCAGCAAAA 20 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
ATCTATCTCT TCACCATGAG GAGCT 25 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
TAGATAGAGA AGTGGTACTC C 21 



What is claimed is: 



1. A non-toxic, non-toxigenic, non-pathogenic recombinant Fusarium host cell 
comprising a nucleic acid sequence encoding a heterologous protein operably linked to a 
promoter. 

2. The host cell of claim 1 in which the Fusarium is Fusarium graminearum. 

3. The host cell of claim 1 in which the Fusarium graminearum has the 
identifying characteristics of ATCC 20334. 

4. The host cell of claim 1 in which the heterologous protein is a fungal 
protein. 

5. The host cell of claim 1 in which the heterologous protein is a secreted 
protein. 

6. The host cell of claim 1 in which the heterologous protein is a fungal 
enzyme. 

7. The host cell of claim 6 in which the fungal enzyme is selected from the 
group consisting of a catalase, laccase, phenoloxidase, oxidase, oxidoreductases, cellulase, 
xylanase, peroxidase, lipase, hydrolase, esterase, cutinase, a proteolytic enzyme, 
aminopeptidase, carboxypeptidase, phytase, lyase, a pectinolytic enzyme, amylase, 
glucoamylase, α-galactosidase, β-galactosidase, α-glucosidase, β-glucosidase, 
mannosidase, isomerase, invertase, transferase, ribonuclease, chitinase, and 
deoxyribonuclease. 

8. The host cell of claim 6 in which the fungal enzyme is a protease. 

9. The host cell of claim 6 in which the fungal enzyme is an alkaline protease. 

10. The host cell of claim 9 in which the alkaline protease is a Fusarium 
oxysporum trypsin-like protease. 

11. The host cell of claim 10 in which the Fusarium oxysporum trypsin-like 
protease has an amino sequence shown in SEQ ID NO:4. 

12. The host cell of claim 6 in which the fungal enzyme is an endoglucanase 
or variant thereof. 

13. The host cell of claim 14 in which the endoglucanase has an amino acid 
sequence shown in SEQ ID NO:8. 

14. The host cell of claim 6 in which the fungal enzyme is a 1,3 lipase or 
variant thereof. 

15. The host cell of claim 12 in which the 1,3 lipase has an amino acid 
sequence shown in SEQ ID NO: 10. 

16. The host cell of claim 1 in which the heterologous protein is selected from 
the group consisting of a hormone, a growth factor and a receptor. 

17. The host cell of claim 1 in which the promoter is a fungal promoter. 

18. The host cell of claim 17 in which the promoter is selected from the group 
consisting of the promoters from A. nidulans amdS. 

19. The host cell of claim 17 in which said fungal promoter is derived from a 
gene encoding a Fusarium oxysporum trypsin-like protease or a fragment thereof having 
substantially the same promoter activity as said sequence. 

20. The host cell of claim 19 in which said promoter sequence is shown in 
SEQ ID NO:5. 

21. The host cell of claim 1 which also comprises a selectable marker. 

22. The host cell of claim 13 in which the marker is selected from the group 
consisting of argB, trpC, pyrG, amdS, niaD and hygB. 

23. The host cell of claim 1 which also comprises a terminator. 

24. The host cell of claim 23 in which said terminator is derived from a gene 
encoding a Fusarium oxysporum trypsin-like protease or a fragment thereof having 
substantially the same terminator activity as said sequence. 

25. The host cell of claim 24 in which said terminator sequence is shown in 
SEQ ID NO:6. 

26. A method for producing a protein of interest which comprises culturing a 
non-toxic, non-toxigenic, non-pathogenic recombinant Fusarium host cell comprising a nucleic 
acid sequence encoding a said protein operably linked to a promoter and isolating said protein. 

27. A promoter sequence derived from a gene encoding a Fusarium oxysporum 
trypsin-like protease or a fragment thereof having substantially the same promoter activity as 
said sequence in which said promoter has the sequence shown in SEQ ID NO:5. 

28. A terminator sequence derived from a gene encoding a Fusarium 
oxysporum trypsin-like protease or a fragment thereof having substantially the same terminator 
activity as said sequence in which said terminator has the sequence shown in SEQ ID NO:6. 